{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "77ac18e5",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "# GPT模型原理详解\n",
    "\n",
    "本教程将帮助你理解GPT (Generative Pre-trained Transformer) 模型的核心原理，这是继Transformer之后的重要发展，也是理解BERT的重要基础，更是大模型的前身。\n",
    "\n",
    "**学习目标：**\n",
    "- 理解GPT的架构设计和创新点\n",
    "- 掌握GPT的预训练与微调方法\n",
    "- 明确GPT与原始Transformer的区别\n",
    "- 为学习BERT做好铺垫\n",
    "\n",
    "![GPT模型图示](./mask.png)\n",
    "\n",
    "### Encoder: 全局编码\n",
    "- Encoder允许每个位置的Token与序列中所有其他Token进行自注意力计算\n",
    "- 没有遮掩(mask)限制，可以完整理解输入序列\n",
    "- 输出隐藏向量供后续处理\n",
    "- 不区分\"过去\"和\"未来\"的概念\n",
    "- 典型应用：机器翻译中的源语言编码、文本分类中的向量化\n",
    "### Decoder: 自回归生成\n",
    "- 目标是从左到右顺序生成序列\n",
    "- 使用look-ahead mask屏蔽未来Token，保证自回归特性\n",
    "- 位置i的Token只能关注位置≤i的Token\n",
    "- 典型应用：语言模型预测、对话系统、机器翻译目标语句生成\n"
   ]
  },
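  {
   "cell_type": "markdown",
   "id": "b3d1a0f2",
   "metadata": {},
   "source": [
    "To make the encoder/decoder contrast concrete, here is a minimal sketch (assuming PyTorch; the `scores` tensor is a random stand-in for real attention logits) showing how a causal mask changes the attention pattern:\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "torch.manual_seed(0)\n",
    "scores = torch.randn(5, 5)             # toy attention logits for a length-5 sequence\n",
    "\n",
    "# Encoder-style: full self-attention, every position can see every other\n",
    "enc_weights = torch.softmax(scores, dim=-1)\n",
    "\n",
    "# Decoder-style: a causal (look-ahead) mask hides the future before the softmax\n",
    "causal = torch.tril(torch.ones(5, 5))  # 1 = visible, 0 = masked\n",
    "dec_weights = torch.softmax(scores.masked_fill(causal == 0, float('-inf')), dim=-1)\n",
    "\n",
    "print(dec_weights.triu(1).abs().sum())  # tensor(0.): no weight ever lands on future tokens\n",
    "```"
   ]
  },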
  {
   "cell_type": "markdown",
   "id": "c09e48e4",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 1. GPT简介\n",
    "\n",
    "GPT（Generative Pre-trained Transformer）是一种基于Transformer架构的生成式预训练语言模型，由OpenAI开发。从名称可以看出：\n",
    "\n",
    "- **Generative**（生成式）：能够生成连贯的文本\n",
    "- **Pre-trained**（预训练）：在大规模无标签文本上进行预训练\n",
    "- **Transformer**：基于Transformer架构设计\n",
    "\n",
    "GPT系列的发展：\n",
    "- **GPT-1** (2018)：首个版本，证明了预训练+微调的有效性\n",
    "- **GPT-2** (2019)：规模更大，参数达到15亿，生成能力显著提升\n",
    "- **GPT-3** (2020)：参数达到1750亿，能够完成各种复杂任务\n",
    "- **GPT-4** (2023)：多模态能力，理解力和推理能力大幅提升\n",
    "\n",
    "理解GPT的架构和原理对于深入学习后续的BERT等模型具有重要意义。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "437e98e0",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 3. GPT的核心创新：预训练+微调范式\n",
    "\n",
    "GPT模型的重大贡献是确立了**预训练+微调**的范式，这种方法极大地改变了NLP领域的发展方向。\n",
    "\n",
    "### 3.1 无监督预训练（Unsupervised Pre-training）\n",
    "\n",
    "GPT首先在大规模无标签文本上进行预训练，学习语言的通用表示。预训练任务是**自回归语言建模**：\n",
    "\n",
    "$$P(x) = \\prod_{i=1}^{n} P(x_i|x_1, x_2, ..., x_{i-1})$$\n",
    "\n",
    "简单来说，就是预测序列中下一个词的概率。例如：\n",
    "\n",
    "> \"今天的天气真是\" → 预测下一个词可能是\"好\"、\"糟糕\"等\n",
    "\n",
    "通过预测大量文本中的下一个词，模型学习到了语言的语法、语义和知识。\n",
    "\n",
    "### 3.2 有监督微调（Supervised Fine-tuning）\n",
    "\n",
    "预训练完成后，针对特定任务（如分类、问答等）使用少量标注数据进行微调：\n",
    "\n",
    "1. 保留预训练模型的参数作为初始化\n",
    "2. 添加适合下游任务的输出层\n",
    "3. 在标注数据上训练，对所有参数进行更新\n",
    "\n",
    "这种方法的优势是：\n",
    "- 充分利用大量无标签数据\n",
    "- 减少对标注数据的依赖\n",
    "- 提高模型在各种任务上的表现\n"
   ]
  },
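  {
   "cell_type": "markdown",
   "id": "9f4c2d71",
   "metadata": {},
   "source": [
    "To connect the product formula above to what is actually computed in practice, here is a tiny sketch (the conditional probabilities are made-up numbers for illustration): the product of conditionals becomes a sum of log-probabilities, which is exactly what the language-modeling loss later in this tutorial minimizes (negated).\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "# Made-up conditional probabilities P(x_i | x_<i) for a 4-token sentence\n",
    "cond_probs = [0.2, 0.5, 0.8, 0.1]\n",
    "\n",
    "p_sequence = math.prod(cond_probs)            # P(x) via the chain rule\n",
    "log_p = sum(math.log(p) for p in cond_probs)  # equivalent sum of logs\n",
    "\n",
    "print(p_sequence)       # ~0.008\n",
    "print(math.exp(log_p))  # same value, recovered from log space\n",
    "```"
   ]
  },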
  {
   "cell_type": "markdown",
   "id": "dcc8c9ee",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 4. GPT的详细架构\n",
    "\n",
    "GPT-1的架构相对简单，但清晰展示了其核心思想。下面我们来详细了解GPT的架构设计：\n",
    "\n",
    "### 4.1 整体架构\n",
    "\n",
    "![GPT架构](./gpt-1.png)\n",
    "\n",
    "GPT由多个相同的Transformer解码器层堆叠而成：\n",
    "- **GPT-1**：12层Transformer解码器\n",
    "- **GPT-2**：根据不同大小有12-48层\n",
    "- **GPT-3**：96层Transformer解码器\n",
    "\n",
    "### 4.2 单层结构\n",
    "\n",
    "每一层Transformer解码器包含：\n",
    "\n",
    "1. **掩码多头自注意力**：确保信息只能从左向右流动\n",
    "2. **位置前馈网络**：对每个位置独立应用相同的全连接层\n",
    "3. **层归一化和残差连接**：确保信息顺畅传递和训练稳定性\n",
    "\n",
    "```python\n",
    "# 简化的GPT单层结构伪代码\n",
    "def transformer_decoder_layer(x, mask):\n",
    "    # 自注意力子层\n",
    "    attn_output = multi_head_attention(x, x, x, mask)\n",
    "    x = layer_norm(x + attn_output)  # 残差连接和层归一化\n",
    "    \n",
    "    # 前馈网络子层\n",
    "    ffn_output = feed_forward_network(x)\n",
    "    x = layer_norm(x + ffn_output)  # 残差连接和层归一化\n",
    "    \n",
    "    return x\n",
    "```\n",
    "\n",
    "### 4.3 输入表示\n",
    "\n",
    "GPT使用学习得到的词嵌入+位置编码来表示输入：\n",
    "\n",
    "- **词嵌入**：将每个token转换为固定维度的向量\n",
    "- **位置编码**：添加位置信息，让模型知道每个词的位置\n"
   ]
  },
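  {
   "cell_type": "markdown",
   "id": "5e8a1c30",
   "metadata": {},
   "source": [
    "As a minimal sketch of this input representation (the dimensions here are chosen arbitrarily for illustration):\n",
    "\n",
    "```python\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "\n",
    "vocab_size, max_position, d_model = 100, 16, 8\n",
    "tok_emb = nn.Embedding(vocab_size, d_model)    # learned token embeddings\n",
    "pos_emb = nn.Embedding(max_position, d_model)  # learned position embeddings\n",
    "\n",
    "tokens = torch.tensor([[7, 42, 3]])            # (batch=1, seq_len=3)\n",
    "positions = torch.arange(tokens.shape[1]).unsqueeze(0)\n",
    "\n",
    "x = tok_emb(tokens) + pos_emb(positions)       # what the first decoder layer consumes\n",
    "print(x.shape)  # torch.Size([1, 3, 8])\n",
    "```"
   ]
  },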
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "6e8224ed",
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "import math\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "# 设置中文字体\n",
    "plt.rcParams['font.sans-serif'] = ['SimHei']  # 用来正常显示中文标签\n",
    "plt.rcParams['axes.unicode_minus'] = False  # 用来正常显示负号\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9020320a",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 5. 实现GPT的关键组件\n",
    "\n",
    "为了更好地理解GPT模型，我们将实现其关键组件。这里我们简化了实现，但保留了核心思想。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a851cca3",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "### 5.1 掩码自注意力\n",
    "\n",
    "掩码自注意力是GPT的核心组件，确保模型只能看到过去的信息。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "c3fc4eb1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAloAAAIhCAYAAACWvhToAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAATMtJREFUeJzt3Ql8VOXV+PFzZ4YkREhMIgWFADGCFjUuGIkLKAVcQFCUqlT7IioulNalpajVumADuGC1ilpIA7gvr1ZQcZe3VgEBMRiDWAExLlj6N2SCwcSQ+X/Og5NOQoCZODczz8zv6+eW3MnMnTMPt+TkPOc+1wkEAgEBAABA1Hmif0gAAACQaAEAALiIihYAAIBLSLQAAABcQqIFAADgEhItAAAAl5BoAQAAuIRECwAAwCUkWgAAAC7xuXVgALHz7rvvyvz582Xq1KmSlZUV1mv+/e9/i8/nk+zs7GaP/+c//5Ha2lrp2bPnbl9/2223yVFHHSU/+9nPmj1eXV0txx9/vNx5551y0kkn7fQ6vTlFXV2dpKamiuM45rHt27fLd999J//6179k2bJlkpGRIR06dGj2uhNOOEG6dOnS7LGrrrpK0tPT5dZbb206VrR8+OGHsmDBAjnvvPP2OBYAEESiBSQgTVDuu+8+ue6668JOtJ5++mmZMmWKbNq0Sfbaa6+mx19++WX5n//5H1m9erUcfPDBu3z93//+950SH/X+++/LmjVr5Mgjj2z1dRs3bpS8vLxWv/fII4/Ic889Z5Iwr9drHvP7/fLqq6/Kxx9/3Oz9NGF7/PHHTayhSdbmzZvl8ssv3+nY+nxN5q655hoZOHCg+ZyzZ89u9pyZM2c2JVWLFy+WP/7xj3L22WfvcgwAoCWmDgGXzZkzxyQSHTt2lBEjRsiXX35pHj/xxBNNQqCbVpIOPPBAU/VpbGw037/pppuavt9y0x/6u6PHC/0zHBUVFTJ06NBmSZbSBOSYY47ZbZKlUlJSmqpOTz31lKlC5eTkyGmnnWY+00EHHST77LOP2fR7weqWJjI1NTVSX18vP/3pT+Wvf/2reb5W0X7xi1+YfR0XTQR1u/76683napmcvf322yZJPOWUU+STTz4xyWZ5ebmpjv3v//6vdOvWTYqKipptxx13nIkxmPDpuOrrNZYVK1aYv7P33nvPHOf555+XQYMGyffffy8fffSR2crKymTdunVhjzGA5ENFC3DRs88+a6opmihoIvXb3/5Wzj//fHnjjTfM9zWBueeee2Tbtm2ydOlSk0Qofd4ll1xikhQ1atQok5hdffXVZl+PFaQVnmByFqSJgtJkZcuWLTslRDq9FqTJydatW03ladiwYSZJCSZAehz9DJp8PPDAA02v0UTn4osvNl9rVSg0odPXBBNKja1fv34mQdLP2qtXL/M++++/vzQ0NJjnezwe6dSpk5k+1OSooKDAJJOa5Ch97bRp0+Trr782+3rc3NzcnZLI+++/3/zZcuryq6++Mn+ee+65ZgpzV3RcNA79XC+99JKpXL344osyYcIE8176d5SZmWkStCBN4vTvRitvANCqAADXFBQUBC666KKm/ddeey2g/7crKysLnHDCCYGTTz652fOvuOKKQG5u7k7H6dWrl/lea/bdd19zzHC3MWPGNHv92LFjW31eRUVFYPbs2YEOHToE+vfv37T17ds30Llz56bXn3HGGYHMzMyA1+sNpKWlmec///zz5nvnnHNOYPDgwYHGxkazf+aZZwZGjBjRtB/q1VdfDaSmpga+++67Zo//4Q9/CJxyyilN+3feeWfgxBNPbPac999/37z/U089FSgpKQnk5+cHampqAl9++WVg+/bt5vO89dZbgd156KGHzDhrbF26dAksXbq06XszZ84MHHLIIbt9PQC0hqlDwCVffPGF6WvSalBQ//79zZ86FdUanV77/PPPmypS4dBKzI033mh6joKbVqFUVVVVs8eHDBli+p1C6f64ceOanlNZWWke154orcRpZU2n0YKbVuD0PYP0vbRqptUi7XEKVrRuv/12U7mbN2+e6Z165ZVXTMP8P//5T7n00ktNJSyUVpEKCwt3iu///b//ZypgoeOqlbFQWu2aOHGijBkzxlTIlFan9t1336b98ePHy+GHH95s09haWrVqlfnsRx99dLPp3yuvvNK8t36tU50AEA4SLcAl2gCuQpOEvffeWz744INmyVfLaS696i80kdmTYJN4uIKJx55er0mTxqMN4KG0R6llMhRKpwCvuOIKueWWW8xVejrNp/1N2kOVn59vpuN0OlCnFNeuXWteo9OIjz76qLzzzjuSlpZmttGjR5vvaQyaMAXp1GGPHj2avaf2e2kC2JImczq9p/7whz/I3Llzmza9ylKnA1v6xz/+YXrFgg31zzzzjEn2dMpX+7F0KvHbb7/d5ecHgFD0aAEu0WURlDZ+hzrkkEN2em6wR0v7oM4555yI3ifSZQxaJlrBREf7tIJ9XeqCCy6QM8880zSja4O8Lp2gtGKliVBr9HMsXLjQVJO0j0z7yvS5ekxN6O6++27z9WWXXSadO3du6jXTREwTKm3G14RIEzSt7CmtImklLthrphU3rXzpfjApO+uss0zFTN9D49M4guOiVyiqAw44wFSxWvZktXTRRRfJgw8+aBKsQw89VH71q1+ZSpm+ryZnwRi0/0wTOU2kWzsOAJh/cxkGwB1a+QlNbPSqt+BVg1pRCV7Rp/vanK7JhDZya+N3JILLFGjiEdyCFRddwyr08WADekvazK2JT2jyo8mVNrBrIvOnP/2p6ZgtEy2t3P3tb3+T9evXy29+8xtzhZ8+XxNKXTpB31evHtQESr/WJnNttNfnKK046dSn6tq1q0nO+vbt2zRummjpcXWZCt20MqYXC+jXmgSqSZMmmUqUNtpfeOGF5jNo9enNN99sduFAOHQMbr75ZlOV06RQLxbQpTL69OkjP//5z81ztCI5YMAAk/CtXLkyouMDSC5UtACXBKscwQTl9ddfN8nX8OHDm55z7LHHmh/iweUKWi6tEA694nDGjBlma6l37947PaZTeS3pFXl/+ctfzNdaWdIr/4K0R0uXmtDeJE0+NKkLTbR0rS59TJd20EqQrmMVNGvWLDNNqMmSVpC0V0srVaGVJa2A6fd3VckLLofRkk4nBqcUBw8e3PT4Z599Jj/5yU9MpSl02lYrdqFXYOq4tbxaM0gTKY1Hn6/LO+iY6RWH2qOmf2c65anLVADAnpBoAS7RqSqllRWdggquQ6U9TKHVk9Ckoy20SqUVmNBeKl08VHuctBle+8KCTj755KZKWyjtuQomDi2b1HXqU6tQjz32WFOVJ3R5iCeffNIkWVqJajktqQ3qmqTpNKSOhy5fob1Owb4pdcYZZ8hhhx1mEk6tHoXSle11vargNGBwFXlNdnQh0tDeLaWfTadg9fmaDIVWs0499dSdPnfLz6p0JXrtTwsuMaG9ZJokBquRLcde42lLggwgOZBoAS7RxEoTAe310WQiuCjmN998E9X3CV4
lGA6dqmwLTZCCC3tqZSi076zlrXGCNCnSBFC3oBtuuKHZn0HBxUd1CjO4JpUmd6effrppdA827GsFSpMp/VOTtZZXH2pvlyZgWtHSW/ToAqR6Jad66623druOVpD2d+n0pP69aUWutSsTQ1ek1yS1tYQNABSJFuASre5oRUYXwNSr5HR6q7i4OC7HW5OnTz/91HzdsqqkQqfgtErWssG/NVr50t6rlhWtXdHqkCZXOv1YWloq//d//9dsCjPU8uXLTYLXvXv3psd0ivbaa681V23qlZ26xIQuwKpXMoYjWOnTylzwilHtUdOrJ7WypdUs7cfSaUWtlun7aL8aSRaA3SHRAlykV7DpD2O9ybEuP6CJhCYqbgv2HmlVKZznagO7bq0doyVNQrSxP0gb7nW9MO29Cp1a056m4HSkPq7f13j0a03AdPpPpzJD6VSqTnXqzalbu32Q9l/pVJ02weuaZMGpSk3SdJ0sXVn+4YcfNhUtrUjp/RU1aVPaBxZc9T44Nlq90mlBrXq1dqGANueH0s+kNMmiRwtAWFpdxhSA1R5//HGzGvrXX3+9x+eed955gXHjxjXtV1ZWmteuWbOm6TFdcf3KK680q7rr95544omm723dujWw9957m9XhdcX71syZM8es+q6rt3s8nkCnTp0C8+bNi/hz6cry+v577bVX4KWXXmp6fMqUKebx3/3ud82e/8UXX5hNY2ttS09PNyvjq/vuu8+sDL87b7/9tnmfr776KuLYASQnR/8nvJQMgC10+k3XqtL+rZaLe7akPUg6DadLMASn4LQKNHbs2Ka+LF2LSve1QqSryJ933nnNjqELemolKtLFUyOl05t6JaAuKBq6aKpOfWo/1eTJkyNeVywSeuWorvW1YcOGVq/oBICWSLQAAABcwoKlAAAALiHRAgAASXV7tLy8vKYrrfdEr4DWdgW9AGbmzJkRvx+JFgAASJok67TTTgs7ydJ1+UaNGmV6VJcsWWLW+dNbe0WCRAsAACSFc889t+nCn3BoYrXffvuZRZb1fqd6B46SkpLEb4bX9X30/me65o6bVxgBAJAI9Ed9TU2NSRpa3iqrPXz33XdmTUE3BNfnC6VXJYdemRykVwzrtKE+P5yrh3V9Pl2wWK9qDt4L9mc/+1nTosYJu2CpJlmt3RgXAADsWjhLvriRZHXsnCPSUOvK8Tt16mSWeAkVvCtFS8HbfYXL7/eb+50G6V0xdnWj+4RKtLSSpVL6jRPHmyI2+WzxHbEOAQCQZGr8fjkgL7fp52d7qtdKVkOtpPYbJxLtn9nb62VrxTyTQIbeGqy1alZb6B0qQo+VlpZmbi8W0THEQsESoSZZtiVa4dwjDgAAN8S03caXFvWf2QHH0/Sz1Y2fr3q7LW2ID9LpV71tVyRohgcAAGhFYWGhudowaNWqVc1uZh8OEi0AAOA+x5TUorxFJzTtxfr+++93elyXdnj77bfltddeM9+/7bbb5OSTT47o2CRaAAAgqRUUFMgLL7yw0+O6SOldd90lw4cPl65du8ratWvl+uuvT/weLQAAYBnHs2OL9jHboOXKVrtbwPSyyy4zVayPPvpIBg4caK5yjASJFgAAcJ/zw3RftI/ZDnRZiEiXhghi6hAAAMAlVLQAAEBSTR22p/iPEAAAwFJUtAAAgPsce3u0fgwqWgAAAC6hogUAANqBx4WeqvivF8V/hAAAAJaiogUAANznJGePFokWAABwn8PyDhCRnIZtsqZivvSs8zMeAADgR6FHq0WS9cz6F6R3fY3YxCkvl5SiQkntkiW+KZP1Jk5iA1vjtjl24ma8E/k8sTl2W+Nu09ShE+UtzsU00SovL5fCwkLJysqSyZMn73STx/Y2/9NX5ImsPmKVujrpMHqkNB7ZX+qXrhBnTYV4582VuGdr3DbHTtyMdyKfJzbHbmvciO9Eq66uTkaOHCn9+/eXFStWSEVFhcydG9sTa2LuYJnV5TCxieelReJUV0vDHTMlkJ8vDVOLxVtaIvHO1rhtjp24Ge9EPk9sjt3WuNvco+VEeYtzMYtw0aJFUl1dLTNnzpT8/HwpLi6WkpLYnlgbUzPENs7qMmkcUCSSnm72AwUF5reheGdr3DbHTtyMdyKfJzbHbmvciPNEq6ysTIqKiiT9hxOroKDAVLV2Vf3y+/3NNuzg+P0S6J333+HQ+WqvV6SqKq6HyNa4bY6duBnvRD5PbI7d1rgj5tCj1a40WcrL+++J5TiOeL1eqWrlxJo2bZpkZmY2bbm5ue0bbDzz+URSU5s/lpYmUlsrcc3WuG2OnbgZ70Q+T2yO3da4Ed8VLZ/PJ6ktTqy0tDSpbeXEuvbaa800Y3CrrKxsx0jjWyA7W5zNm5s/WFMjkpIi8czWuG2OnbgZ70Q+T2yO3da4I+bQo9WusrOzZXOLE6umpkZSWjmxNCHLyMhotmGHwFGF4ixb0jQczoYN5goWyc6O6yGyNW6bYyduxjuRzxObY7c17rZNHXqivLG8wy7psg5Llvz3xNqwYYPpxdIEDOFrHDjIzO9755aafe/0YmkcMnTH/H4cszVum2MnbsY7kc8Tm2O3NW6ExwnEaPGqhoYG2W+//WTGjBkyfvx4mTBhgmzatEkWLlwYVn+X9mqlHjpBHK9dpdWq5fdG/ZiehQukw/ljRTp2FPF4pP61xRLo10/ina1x2xw7cTPeiXye2By723Hrz82uOZmm/aa9Z4X8wZ/Zx18nji8tqscONHwndf8sjsnnivtESy1YsEDGjh0rHTt2FI/HI4sXL5Z+YZxYJFqt2LRJPO+t3HGJcE6OWMPWuG2OnbgZ70Q+T2yO3cW4SbSSNNFSWsVauXKlWeohJ8wTi0QLAADLEq2B17tT0Xrr1riuaPliHUC3bt1kxIgRsQ4DAAAg8RItAACQBBwXbgLNVYcAAADJi4oWAABwn+PCTaAtuKk0iRYAAHCfw9QhAAAAooiKFgAAcJ+TnFOH8R8hAACApahoAQAA9zn0aAEAACCKqGgBAAD3OfRoAQAAIIqoaAEAAPc5ydmjRaIFAADagceF5Rjif/GE+I8QAADAUlS02llW4SSxUdXye2MdAgDAZk5yTh1S0QIAAHAJFS0AANBOFS1P9I8Z56hoAQAAuISKFgAAcJ/DgqUAAACIIipaAADAfQ5XHQIAACCKqGgBAAD3OcnZo0WiBQAA3OcwdQgAAIAooqIFAADc5yTn1GH8RwgAAGApKloAAMB9Dj1aAAAAiCIqWgAAwHWO45gtygeVeEePFgAAgEuoaAEAANc5SVrRItECAADuc37Yon3MOMfUYQs5DdtkTcV86VnnF5vYGLdTXi4pRYWS2iVLfFMmiwQCYgtbYyduxjuRzxObY7c1buwZiVaLZOWZ9S9I7/oasYmVcdfVSYfRI6XxyP5Sv3SFOGsqxDtvrljB1tiJm/FO5PPE5thtjbuNU4dOlLd4R6IVYv6nr8gTWX3ENjbG7XlpkTjV1dJwx0wJ5OdLw9Ri8ZaWiA1sjZ24Ge9EPk9sjt3WuBEeEq0QE3MHy6wuh4ltbIzbWV0mjQOKRN
LTzX6goMD8FmcDW2MnbsY7kc8Tm2O3Ne5IOVS0sDE1w8pBsDFux++XQO+8kAccEa9XpKpK4p2tsRM3453I54nNsdsaN8JDRQux4fOJpKY2fywtTaS2Nv7/RmyNnbgZ70Q+T2yO3da4I+RQ0QLaTyA7W5zNm5s/WFMjkpIS938NtsZO3Ix3Ip8nNsdua9wIDxUtxETgqEJxli1p2nc2bDBX3kh2dtz/jdgaO3Ez3ol8ntgcu61xR8qhogW0n8aBg0xfgnduqdn3Ti+WxiFDd/QlxDlbYyduxjuRzxObY7c17jYvWOpEeYtzrAyPGJ15Pvn+wTnS4fyx4rtmsojHI/WvLbbjb8PW2Imb8U7k88Tm2G2NG2FxAgH7lp/1+/2SmZkpqYdOEMfLHHZ7qFp+rzsH3rRJPO+t3HFpc06OWMXW2Imb8U7k88Tm2F2MW39uds3JlOrqasnIyIjJz+yMn/9VnA4do3rswPfbxP/UJTH5XOEi0UJsEy0AgOtItGKHqUMAAOA6x9nREB/dg0rc46pDAAAAl1DRAgAArnPEjZtAx39Ji4oWAACAS6hoAQCAdluwNKqiXiGLPhItAADgPseFmb74z7OYOgQAAHALFS0AAOA+J/pThwELpg5phgcAAHAJFS0AAGBlM7xDRQsAACB5UdECAACuc6hoAQAAJK7y8nIpLCyUrKwsmTx5sgQCgd0+X79/+eWXS3Z2tuy9995ywQUXyLZt2yJ6T5rhAQBA+62j5UR5C1NdXZ2MHDlS+vfvLytWrJCKigqZO3fubl/z0EMPydq1a2XVqlXy1ltvyYcffijTpk2L6GOTaAEAgHabOnSivIVr0aJFUl1dLTNnzpT8/HwpLi6WkpKS3b7m3XfflTFjxkivXr3k0EMPlTPOOEM++eSTiD43iRYAAEh4ZWVlUlRUJOnp6Wa/oKDAVLV25+CDD5aHH35Yvv76a9m4caM8/vjjMmzYsIjel2Z4hCWrcJK1I1W1/N5YhwAASc9xsRne7/c3ezw1NdVsofQ5eXl5zV7r9XqlqqrK9Gy15uKLL5b7779funXrZvZ16nHcuHERxUhFCwAAWC03N1cyMzObttb6qHw+307JV1pamtTW1u7yuHfffbdpgtdq1meffSYNDQ2miT4SVLQAAIDVFa3KykrJyMhoerxlQqX0ykG96jBUTU2NpKSk7PL4jzzyiNxyyy3Ss2dPs68J3AknnCB33nln2DFS0QIAAFbLyMhotrWWaOmyDkuWLGna37Bhg7kSUROwXWlsbJR///vfTfubNm2S7du3RxQbFS0AAJDwC5YOGjTI9GmVlpbK+PHjzVWHQ4cONX1aW7Zskc6dO5uvQw0cOFCmT59uHq+vr5cZM2bIqFGjIoqRRAsAACQ8n88nc+bMkbFjx5o+K4/HI4sXLzbf02Z4XSvr8MMPb/aaW2+91SRnv//9780048knn2z6tiJ636h+CgAAgNY4kS0wGpYIj6fVqHXr1snKlSvNUg85OTnm8V2tEK+N8PPnz/9RIZJoAQCAhJ86DNKlGkaMGCHthWZ4AAAAl1DRAgAASVPRam9UtAAAAFxCRQsAALjOoaIFAACAaKKiBQAAkmJ5h1igRwsAAMAlVLQAAIDrnCTt0SLRAgAArnOSNNFi6hAAAMAlJFot5DRskzUV86VnnV9sQtztyykvl5SiQkntkiW+KZP1RlliA+JmvBP5PLE5dlvjjoSj/zlR3izoho9povXcc8/J/vvvb+6orXfMXrNmTcyTlWfWvyC962vEJsTdzurqpMPokdJ4ZH+pX7pCnDUV4p03V+IecTPeiXye2By7rXEjvhMtvXv2+PHjZfr06fLFF19I37595eKLL5ZYmv/pK/JEVh+xDXG3L89Li8SprpaGO2ZKID9fGqYWi7e0ROIdcTPeiXye2By7rXFHyol2NcuFnq+ESrS0eqVJ1tlnny1du3aVyy+/XFatWiWxNDF3sMzqcpjYhrjbl7O6TBoHFImkp5v9QEGB+Q003hE3453I54nNsdsaN+L8qsPTTjut2f7atWulT5/YVpM2pmaIjYi7fTl+vwR654U84Ih4vSJVVSJZWRKviJvxTuTzxObYbY07Yg4LlsZMfX293HnnnXLZZZe1+v26ujrx+/3NNiBmfD6R1NTmj6WlidTWSlwjbsY7kc8Tm2O3NW7Yc9XhjTfeKHvttdcue7SmTZsmmZmZTVtubm67xwgEBbKzxdm8ufmA1NSIpKTE9SARN+OdyOeJzbHbGnekHHq0YuONN96Q++67Tx599FHp0KFDq8+59tprpbq6ummrrKxs9ziBoMBRheIsW9K072zYYK4akuzsuB4k4ma8E/k8sTl2W+OOlEOi1f42bNggY8eONYlWv379dvm81NRUycjIaLYBsdI4cJDpqfDOLTX73unF0jhk6I6eijhG3Ix3Ip8nNsdua9yI82b4bdu2mYb4008/XUaPHi1bt241j+sUog2XayKJ+Xzy/YNzpMP5Y8V3zWQRj0fqX1sscY+4Ge9EPk9sjt3WuCPkODu2aB8z3jmBQGyWn9XFSs8444xWq1y9e/fe7Wu1GV57tVIPnSCON7HmsBF9VcvvdWdYN20Sz3srd1yWnZMj1iBuxjuRzxObY3cxbv252TUn07TftPeskP+Hn9l5k54WT+qOJSyipbGuVjbcOyYmnyvuE60fg0QLcZFoAYAl4iHR2v/XmmjtFdVjN9Z9K+v/Et+JVlxcdQgAAJCIYtajBQAAkojjQk+VBT1aVLQAAABcQkULAAC4znHhJtA2rFJARQsAAMAlVLQAAIDrnCRdR4tECwAAuM7jccwWTYEoH88NTB0CAAC4hIoWAABwnZOkU4dUtAAAAFxCRQsAALjOYXkHAAAARBMVLQAA4DqHHi0AAABEExUtAADgOidJe7RItAAAgOscEi0gMWUVThIbVS2/N9YhAAB+JCpaAADAdQ7N8AAAAIgmKloAAMB1jrjQDC/x3wzPLXgAAABcQkULAAC4zqFHCwAAANFERQsAALjOYR0tAAAAtxItMVu0jxnvaIYHAABwCVOHAADAdU6STh1S0QIAAHAJFS0AAOA6hx4tAAAARBMVLQAA4DqHHi0AAABEExUtAADgPseFda/i/6JDEi0AAOA+h6lDqJyGbbKmYr70rPNbNSDEzXgDAOIP62i1SFaeWf+C9K6vEZsQN+MdLqe8XFKKCiW1S5b4pkwWCQTEBsTNeHOuJM7yDk6Ut3hHohVi/qevyBNZfcQ2xM14h6WuTjqMHimNR/aX+qUrxFlTId55cyXuETfjzbkCi5FohZiYO1hmdTlMbEPcjHc4PC8tEqe6WhrumCmB/HxpmFos3tISiXfEzXhzriRWj5YT5S3ekWiF2JiaITYibsY7HM7qMmkcUCSSnm72AwUFpqoV74ib8eZcgc1ItIAk4fj9EuidF/KAI+L1ilRVSTwjbsabcyUxOPRoAUhoPp9Iamrzx9LSRGprJa4RN+PNuQKLUdECkkQgO1uczZubP1hTI5KSIvGMuBlvzpXE4NCjBSCRBY4qFGfZkqZ9Z
8MGc0WfZGdLPCNuxptzJTE4JFoAElnjwEGm38k7t9Tse6cXS+OQoTv6tOIYcTPenCuwmRMIWLJiYQi/3y+ZmZmSeugEcbzxPe0BtFXV8nujPniehQukw/ljRTp2FPF4pP61xRLo10/iHXEz3pwrP/7nZtecTKmurpaMjIyY/Mw+tvhl8aXtFdVjN3z3rbxz3ckx+Vzh4qbSQBJpHDlK6tauE897K3cs9ZCTIzYgbsabcwW2ItECkk23btI4fIRYh7gZb84VqzncVBoAAADRREULAAC4znHhJtAW3IGHdbQAAADcQkULAAC4zknSHi0SLQAA4DrHham++E+zmDoEAABwDRUtAADgOo/jmC3ax4x33FQaAADAJVS0AACA6xyWdwAAAEA0UdECAACuc5J0eQd6tAAAAFxCogUAAFzncdzZIlFeXi6FhYWSlZUlkydPlkAgENbrGhsb5dhjj5U777wz8s8d8SsAAAAi5fx3+jBaWyQrltbV1cnIkSOlf//+smLFCqmoqJC5c+eG9doHHnhAqqur5Te/+U3EH5tECwAAJLxFixaZZGnmzJmSn58vxcXFUlJSssfXffnll3LdddfJX/7yF+nQoUPE70szPBCnsgoniY2qlt8b6xAAJNnyDn6/v9njqampZgtVVlYmRUVFkp6ebvYLCgpMVWtPrrzySunVq5dUVlbKO++8Y6YQI0FFCwAAWC03N1cyMzObtmnTpu30HE3G8vLymvZ16tHr9UpVVdUuj7tkyRJ56qmnpEePHrJu3ToZN26cTJoU2S/BVLQAAIDrnB/+i/YxlVabMjIymh5vWc1SPp9vp8fT0tKktrbWNMe3Zvbs2TJgwAB5/vnnTWI2YcIEU9369a9/LQceeGBYMVLRAgAAVsvIyGi2tZZoZWdny+bNm5s9VlNTIykpKbs87ueffy7Dhw9vWq9LK2ddunQx1a1wkWgBAICEX96hsLDQTAUGbdiwwVyJqAnYruiU4bZt25r2t27dKt9884107949/M8dfogAAAB2GjRokOnTKi0tNft61eHQoUNNn9aWLVtk+/btO71m7NixZvrw9ddfl40bN8rEiRPloIMOMo304aJHCwAAJPwteHw+n8yZM8ckT7pYqcfjkcWLF5vvaY/WqlWr5PDDD2/2mmHDhsmMGTPk8ssvN31g+v2nn346sveN4PMAAADE3fIO4Ro1apTpr1q5cqVZ6iEnJ8c8vrsV4i+66CKztRWJFgAASBrdunWTESNGtNv7kWgBAADXeRzHbNE+ZryjGR4AAMAlVLQAAEBS9GjFAhUtAAAAl1DRAgAACb+8Q6xQ0QIAAHAJFS0AAOA6J0l7tEi0AACA6zws7wAAAIBookerhZyGbbKmYr70rPOLTYib8U7k8wSA/RyXtqRMtK666qqInn/KKafI3LlzJR5+CD2z/gXpXV8jNiFuxjuRzxPllJdLSlGhpHbJEt+UyXpjMrEBcTPmiX6uIIqJ1i9+8Qu54IIL5JZbbpHnn39e7rjjjl0+d9myZeEeVh555BF5+eWXJR7M//QVeSKrj9iGuBnvRD5PpK5OOoweKY1H9pf6pSvEWVMh3nmx/8Vsj4ibMU/0c6WNyzs4Ud4SJtFavny5HHPMMfLaa6/J9u3bZdu2bfL000/L3/72N5k/f36zLVzffPON/Pa3v5UDDzxQ4sHE3MEyq8thYhviZrwT+TzxvLRInOpqabhjpgTy86VharF4S0sk3hE3Y57o5wqifNXhvvvuK5deeqk8/PDDJoP0eDxy2223yRFHHCEbN26UsrIyOeqoo2SvvfaSQJglT02yRo8ebZK2eLAxNUNsRNyMdyKfJ87qMmkcUCSSnm72AwUF5jf+eEfcjHminyuR8jg7tmgfM2EqWp999pkUFxfvVKYbN26cDB48WPbZZx8ZNmyYSZw6duy4x+O9+eab8vrrr5tkbU/q6urE7/c32wAkB8fvl0DvvJAHHBGvV6SqSuIZcTPmiX6uIMqJVkpKinTt2tVUqz744IOdEq5I5km/++47Ux27//77pXPnznt8/rRp0yQzM7Npy83NDfu9AFjO5xNJTW3+WFqaSG2txDXiZswT/VyJkEOP1u5169ZNLrroIqmtrZXp06fLn/70J5NwtcXUqVOlsLBQRowYEdbzr732Wqmurm7aKisr2/S+AOwTyM4WZ/Pm5g/W1OhvfxLPiJsxT/RzBVHu0friiy/knnvuMT1Y2qe1ZMkSefHFF6UtHn30Udm8ebPsvffeZl+TtyeffFLeffddmTVr1k7PT01NNRuA5BM4qlCcktlN+86GDeYqLcnOlnhG3Ix5op8rbeFY0FMVs6nDtLQ06dWrl5k61PKfTvnpY23x1ltvSXl5ubz//vtmGzVqlFk2QjcACNU4cJDpYfHOLTX73unF0jhk6I4eljhG3Ix5op8rkXKSdOow7IpWTk6OnH766c3Wz9Kka8aMGbJp0yZT8XriiSdM8rV169bdHqtHjx7N9jt16mSa6XWLBx0P/5XYiLgZ74Q8T3w++f7BOdLh/LHiu2ayiMcj9a8tlrhH3Ix5op8riG6i9fHHH8uFF15ovtZ1tHS76667zLSft0XW/cc//lEiEQ+rwgOIX40jR0nd2nXieW/ljsvgc3LEBsTNmCf6uRIJT5Iu7xB2oqUN6Xrl4amnnmqWW6ivr5fjjz++1edOmTIlmjECgF6RI43Dw7uAJq4QN2Oe6OcKopNoXXHFFeE+1VydCAAAEORGT5UNPVqu3FT68ssvd+OwAAAAVoko0dLm9wkTJuz2Odqvpf1bAAAAQY5LW0IlWlqie+qpp8zXy5Yta/WehnoPxJbN8QAAAMko7B6t0ERKb6EzcOBAycrKkpNOOkmGDBliNr01jg3zpQAAoH15HMds0T5mQlW0vv76a/OnrpWlC47ed9990r17d5k/f74cdNBBsv/++zNtCAAAduI47mwJUdGqqKgwN4HWGzoH9e3b12xjxowx+99//728/fbbMmzYMPeiBQAAsEhYFS1Noo477jh55JFHWk3CdLV4vUG0z+dj6hAAAOzE4RY8u3bYYYeZLdTdd98tDz74oHz22Wfys5/9zCRaeXl5nFoAAABtbYb/9ttvzZ8ZGRnmVjsjR46Uvfbaq+n7rV2JCAAAkpvjQk9VwvRoqY0bN0qvXr3k2WefNfvjx4/f6TmaZBUWFpqGeJZ4AAAAyS6sROvFF1+UM844w2za/P7KK6/s8rla5dLna6ULAAAgmZd3CCvROvHEE+Xxxx+XJ598Un75y19KQ0OD9OjRQ/Lz83eaKtTG+ZqaGhItAACQ9MJKtNLT0+XMM88025dffikzZsyQkpISOeuss2Tq1KnNerQAAABacpK0Ryvim0rvt99+5orDlStXytatW80GAACwOw7LO0TmwAMPlL/+9a+cVQCaySqcZO2IVC2/N9YhAEgwEVe0gvTWO1dffbUsXbo0uhEBAICETDg8Lmzxrs0xzps3T+rr6+XnP/+59O7dW37/+9+b6UQAAAD8yERr6NChcu+990plZaU89dRT0qFDB3OfwwMO
OEBuuOEG2bx5c1sPDQAAEoyTpD1aP7rqplOHmmjp8g8pKSly8skny7///W/zJwAAQDKL+BY8QZMmTZLnnntO6urqZPTo0TJ79myz3pbH4zH3P9RmeQAAAKXFJ08SLu/Q5kTru+++M2tpDRkyZKfb7fzkJz+RdevWRSM+AAAAa7U50ZozZ84uv5eWlmbW2wIAAFAeFypa0T5eXPVorVmzRrZt2xbdaAAAQEJyaIaPjE4ZLlu2zKW/DgAAAPu1uaI1btw4KS0tjW40AAAgoacOPVHeEnodrX/9618yYsQIWbRokfzjH/9o2gAAAPAjmuEvvvhi8+dXX30lEydObDYHu379esYWAAA0W4oh2ssxJPTyDhs2bIhuJAAAAAmmzYkWAABAuDyOY7Zoivbx4qpH6/vvv5fi4mIZMGCAdO/eXT788EM5+uijWagUAADgxyZa2pf15JNPyoUXXig1NTWSnp4uxx57rFx66aVtPSQAAEjghMPjwpawU4dPP/20rFixQvLz8+Waa64xt+GZMmWK9O3bN7oRAgAA6zlJ2gzf5mQwNze32VIOerWhTh/m5eWJzXIatsmaivnSs84vNiFuxjuRzxPbYweQvNqcaN12221y+eWXyzHHHCO1tbVy9dVXy/nnny933HGH2Er/IX9m/QvSu75GbELcjHcinyc2x+6Ul0tKUaGkdskS35TJIoGA2MDWuG2O3da4I+GRHc3wUd3ESdxE65RTTpHy8nI57bTT5KKLLpIjjjhC3nnnHTnppJPEVvM/fUWeyOojtiFuxjuRzxNrY6+rkw6jR0rjkf2lfukKcdZUiHfeXIl7tsZtc+y2xo2w/Kg+sgMOOED+8Ic/yKxZs+S6666T/fffX2w2MXewzOpymNiGuBnvRD5PbI3d89IicaqrpeGOmRLIz5eGqcXiLS2ReGdr3DbHbmvcbe3RcqK8JWwzvMfjMX1Zrdm+fbvYaGNqhtiIuBnvRD5PbI3dWV0mjQOKRNLTzX6goMBUKuKdrXHbHLutcaMdV4bXHi29AvH22283FS4ASHaO3y+B3iEXB+kvpl6vSFWVSFaWxCtb47Y5dlvjjpTHhZtA23BT6TYnWr169Wq2/9Of/tT0bY0cOVLOOeecaMQGAPby+URSU5s/lpamv5nG9w9PW+O2OXZb40ZYorrWV8eOHWXTpk3RPCQAWCmQnS3O5s3NH6ypEUlJkXhma9w2x25r3JFyTEUrulcdJnSP1uDBg5v1aDU2NkpFRYUMGzYsWrEBgLUCRxWKUzK7ad/Rdou6OpHsbIlntsZtc+y2xh0phwVLw1dZWWmmB3UpB93OPvtsGTJkiDz++OPyyCOPuPjXBAB2aBw4yPTeeOeWmn3v9GJpHDJ0R+9NHLM1bptjtzVuhMcJBMJfFe2zzz6T8847T95++23p3LmzZGZmmse3bNki3377rbnB9DfffCMPPPCAnHjiieIWv99v3jv10AnieBOrtAogdqqW3xvV43kWLpAO54/Vvgq9VFvqX1ssgX79JN7ZGrfNsbsdt/7c7JqTKdXV1ZKR0b5X8fp/+Jl9/XPvSdpenaN67O++rZFbTz8yJp/LlanD8ePHS9euXWXjxo3mFjxK87R77rlHbrrpJvn4449ln332kYEDB7oVLwBYo3HkKKlbu048763ccfl+To7YwNa4bY7d1rgR5URLV35fs2ZNU5KltE9Lky+tYunteA488EBzg2kAgIh06yaNw0fYNxS2xm1z7LbGHSbnh/+ifcyEuuqwT58+Mm/evJ0eP/fcc03P1pw5c6Rv377RjA8AACA5Klp6q53TTz/dJFsFBQVNPVpVVVWyevVq2bZtmyxYsMCtWAEAgKU8LFi6Z8cff7ysX79eFi5cKB988IFJsNTBBx8sY8eONTeY1iZ5AAAAtGEdLa1inX/++YwdAAAIm4eKFgAAgDscs5J7lJvhLVgaPqq34AEAAEAUbsEDAAAQLk+STh1S0QIAAHAJFS0AAOA6h5tKAwAAIJqoaAEAANd5HMds0T5mvKNHCwAAwCUkWgAAoN2uOvREeYtEeXm5FBYWSlZWlkyePFkCgUDYr92yZYvsu+++8umnn0b2uSMLEQAAoA2c/zbER2vTY4arrq5ORo4cKf3795cVK1ZIRUWFzJ07N+zXa2K2adOmiD82iRYAAEh4ixYtkurqapk5c6bk5+dLcXGxlJSUhPXaf/zjH7JgwQLJycmJ+H1JtAAAgOs84riyKb/f32zT6lVLZWVlUlRUJOnp6Wa/oKDAVLX2RI916aWXyj333COdOnWK+HNz1SEA/CCrcJKVY1G1/N5YhwDEVG5ubrP9G2+8UW666aZmj2kClpeX1+w+iV6vV6qqqkzP1q5o5atv375yzjnnyJQpUyKOjUQLAABYvWBpZWWlZGRkND2empq603N9Pt9Oj6elpUltbe0uE601a9bIAw88IKtWrWpzjCRaAADAahkZGc0SrdZkZ2ebqw5D1dTUSEpKSqvP1ysSL7nkErn11ltlv/32a3Ns9GgBAICEX96hsLBQlixZ0rS/YcMG03+lCVhrPvvsM/nnP/9prjbce++9zaaPaW/Xo48+Gvb7UtECAAAJb9CgQaZPq7S0VMaPH296r4YOHWr6tHSNrM6dO5uvg7p3726SsVDHH3+8PP7443L44YeH/b4kWgAAIOFvwePz+WTOnDkyduxYU6XyeDyyePFi8z3t0dI+rNAESp/fu3fvnY7Ro0ePiK4+JNECAABJYdSoUbJu3TpZuXKlWeohuC5WuCvER7oqvCLRAgAAVl91GIlu3brJiBEjpL2QaAEAANd5xIWpw0juwRMjXHUIAADgEipaAAAgaaYO2xsVLQAAAJdQ0QIAAO1S2fG4cMx4Z0OMAAAAVqKiBQAAXOc4jtmifcx4R0ULAADAJVS0AACA65wftmgfM96RaAEAgIS/12GsMHXYQk7DNllTMV961vnFJsTNeCfyeWJz7DbG7ZSXS0pRoaR2yRLflMl6Izixha2x2xo3LEm0pkyZIiNHjoyLfxCfWf+C9K6vEZsQN+OdyOeJzbFbGXddnXQYPVIaj+wv9UtXiLOmQrzz5ooVbI3d1rh/xPShE6XNBjFPtFavXi2zZs2Su+++O9ahyPxPX5EnsvqIbYib8U7k88Tm2G2M2/PSInGqq6XhjpkSyM+XhqnF4i0tERvYGrutccOCRKuxsVEuueQSueqqq2T//feXWJuYO1hmdTlMbEPcjHcinyc2x25j3M7qMmkcUCSSnm72AwUFpsJiA1tjtzXutt6Cx4nyFu9immg98MAD8sEHH0jv3r1lwYIFUl9f3+rz6urqxO/3N9vcsDE1Q2xE3Ix3Ip8nNsduY9yO3y+B3nkhDzgiXq9IVZXEO1tjtzVuxHmitXXrVrnxxhtNJWvjxo1y1113yfHHHy/btm3b6bnTpk2TzMzMpi03NzcmMQNAwvP5RFJTmz+WliZSWytxz9bYbY27jQuWOlHe4l3MEq1nnnlGvv32W3nzzTfl5ptvlldffVVqamrkoYce2um51157rVRXVzdtlZWVMYk
ZABJdIDtbnM2bmz9YUyOSkiLxztbYbY0bcZ5off7551JUVCT77LOP2ff5fFJQUCCffPLJTs9NTU2VjIyMZhsAIPoCRxWKs2xJ076zYYO5Kk6ys+N+uG2N3da423pTaU+Ut3gXsxh79Oix0zShTiF27949ViEBQNJrHDjI9Ax555aasfBOL5bGIUN39AzFOVtjtzXuSDlJOnUYs5XhR4wYIb/+9a9NQ/xpp51mphLLysrkqaeeilVIAACfT75/cI50OH+s+K6ZLOLxSP1ri+0YF1tjtzVuhMUJBGK3/Ozbb78tv/vd70yCte+++8qf//znsBYu1asOtSk+9dAJ4niZwwaQ3KqW3xv9g27aJJ73Vu5YdiAnR6xia+wuxq0/N7vmZJo+5/Zuv/H/8DN77lsfSXqnzlE9du3WGrlg4EEx+VxW3OvwuOOOkyVL/jsvDQCIE926SePwEWIlW2O3NW7sFjeVBgAArnNc6KmyoUfLhoZ9AAAAK1HRAgAArvO4UN2xoVpkQ4wAAABWoqIFAABc5yRpjxaJFgAAcJ3zwxbtY8Y7pg4BAABcQkULAAC4znF2bNE+ZryjogUAAOASKloAAMB1HnHMFu1jxjsqWgAAAC6hogUAAFzn0KMFAACAaKKiBQAAXOf88F+0jxnvSLQAAIDrHKYOAQAAEE1UtADAclmFk8RGVcvvjXUIaEeOC8s72DB1yPIOAAAALqGiBQAAXOfQowUAAIBooqIFAABc51DRAgAAQDRR0QIAAK5zWLAUAADAHR5nxxbtY8Y7lncAAABwCVOHAADAdU6STh1S0QIAAHAJFS0AAOA6h+UdAAAAEE1UtAAAgOscF3qq4r9Dix4tAAAA11DRAgAArvMk6TpaJFoAAMB1Dss7AAAAIJpYR6uFnIZtsqZivvSs84tNiJvxTuTzxObYibt9OeXlklJUKKldssQ3ZbJIICA2sDXutizv4ER5i3ckWi3+QXxm/QvSu75GbELcjHcinyc2x07c7ayuTjqMHimNR/aX+qUrxFlTId55cyXu2Ro3wkKiFWL+p6/IE1l9xDbEzXgn8nlic+zE3b48Ly0Sp7paGu6YKYH8fGmYWize0hKJd7bG3bblHSTqW7wj0QoxMXewzOpymNiGuBnvRD5PbI6duNuXs7pMGgcUiaSnm/1AQYGpDsU7W+NGeLjqMMTG1AyxEXEz3ol8ntgcO3G3L8fvl0DvvJAHHBGvV6SqSiQrS+KVrXFHyiOOeKLcVKXHjHdUtAAAicHnE0lNbf5YWppIba3ENVvjRlhItAAACSGQnS3O5s3NH6ypEUlJkXhma9yRcujRAgDAXoGjCsVZtqRp39mwwVzRJ9nZMY0rUeOOmJOcmRYVLQBAQmgcOMj0O3nnlpp97/RiaRwydEe/UxyzNW6Eh2Z4AEBi8Pnk+wfnSIfzx4rvmskiHo/Uv7ZY4p6tcUfISdJb8DiBgH3Lz/r9fsnMzJTUQyeI402sOWwASBZVy+9158CbNonnvZU7lkzIyRFruBi3/tzsmpMp1dXVkpGREZOf2a+v+kz26hzd9/62xi9DjugZk88VLipaAIDE0q2bNA4fIdaxNe5wOS7cMif+C1r0aAEAALiFihYAAHCd40IByoKCFhUtAAAAt1DRAgAA7nOSs6TFOloAAAAuoaIFAABc5yTpOlokWgAAwHWOC8s7RH25CBcwdQgAAOASKloAAMB1TnL2wlPRAgAAcAsVLQAA4D4nOUta9GgBAAC4hEQLAAC02/IOTpT/i0R5ebkUFhZKVlaWTJ48WQKBwB5fc/PNN0t2drakpqbK6NGjpaamJqL3JNECAAAJr66uTkaOHCn9+/eXFStWSEVFhcydO3e3r3nkkUfM9tJLL8mHH34oa9askenTp0f0viRaAACg3dbRcqK8hWvRokVSXV0tM2fOlPz8fCkuLpaSkpLdvqayslLmzZsnRx99tBxwwAFyzjnnyKpVqyL63DTDAwCAhO+FLysrk6KiIklPTzf7BQUFpqq1O9dcc02z/bVr10qfPn0iipFECwAQE1mFk6wd+arl98Y6BITw+/2hu6afSreWz8nLy2vadxxHvF6vVFVVmZ6tPfn444/l2Weflffee08iwdQhAABov5KWE+VNRHJzcyUzM7NpmzZt2k5v7/P5dkq+0tLSpLa2do+hNzY2yoUXXigXX3yxHHzwwRF9bCpaAADAapWVlZKRkdG03zKhUnrloF51GEqvIExJSdnj8adOnSrffPON3H777RHHRqIFAABc57RhOYZwjqk0yQpNtFqjyzrMnj27aX/Dhg3mSkRNwHZn4cKFpoF+6dKlTf1dkWDqEAAAJLxBgwaZPq3S0lKzr1cdDh061PRpbdmyRbZv377Ta3Q5h7Fjx8pf/vIXMz25devWsKYaQ5FoAQCAhF/ewefzyZw5c2TSpEmyzz77yHPPPSczZsww39Nm+A8++GCn1/z1r3+Vb7/9VsaNGyedO3c2W79+/SL63EwdAgCApDBq1ChZt26drFy50iz1kJOTYx7f1Qrxd911l9l+DBItAACQ8OtoBXXr1k1GjBgh7YVECwAAJE+m1c7o0QIAAHAJFS0AAGD18g7xjIoWAACAS6hoAQAA1zkRLscQ7jHjHRUtAAAAl1DRAgAArnOS86JDKloAAABuoaIFAADc5yRnSYtECwAAuM5heQeonIZtsqZivvSs81s1IMTNeCfyeWJz7MTNeCO5xeyqQ72Ddm5urqSnp8uJJ54o69evl3j4B/GZ9S9I7/oasQlxM96JfJ7YHDtxM97hcsrLJaWoUFK7ZIlvymS9y7Ek6vIOTpS3eBeTREvvnH3LLbfIc889Jx999JHk5+fLBRdcILE2/9NX5ImsPmIb4ma8E/k8sTl24ma8w1JXJx1Gj5TGI/tL/dIV4qypEO+8ue08ekioRGvVqlVSVFQkRx55pPTs2VMuvPBC+eSTTyTWJuYOllldDhPbEDfjncjnic2xEzfjHQ7PS4vEqa6WhjtmSiA/XxqmFou3tEQStRfeifIW72KSaPXr10/eeOMNef/996W6ulpmzZolw4YNk1jbmJohNiJuxjuRzxObYyduxjsczuoyaRxQJJKebvYDBQWmqoXE4ItVojVmzBg54ogjzH5eXp4sW7Zsl8+vq6szW5Dfb1czLAAAu+L4/RLonRfygCPi9YpUVYlkZSXOwDnJubxDTCpa7777rixcuFCWLl0qW7ZskbFjx8rw4cMlsIvmv2nTpklmZmbTpk30AAAkBJ9PJDW1+WNpaSK1tbGKCLYnWo899pice+65MmDAAJM43XrrraZBvqysrNXnX3vttWaKMbhVVla2e8wAALghkJ0tzubNzR+sqRFJSUnIdbScKP8X72IyddjY2Cj/+c9/mvZramqktrZWtm/f3urzU1NTzQYAQKIJHFUoTsnspn1nwwZzJaJkZ0tCcVxYjiH+86zYJFoDBw6UcePGmasOu3btatbU6tatmxQUFMQiHAAAYqZx4CDTp+WdWyrbLxgv3unF0jhk6I4+LVjPCeyqMcpF+pY6XagJ1ldffS
WHHHKIlJSUNDXH74k2w+uUY+qhE8TxJlZpFQAQ/6qW3xvV43kWLpAO548V6dhRxOOR+tcWS6Bfv6gdX39uds3JNO03GRntexWv/4ef2as+2SSdO0f3vWtq/HLEAd1i8rniuqLlOI7ccMMNZgMAINk1jhwldWvXiee9lTuWesjJiXVIiBJuKg0AQDzo1k0ah4+QhOWwvAMAAACiiIoWAABwnePCcgw2LO8Qk3W0AAAAkgEVLQAA4DrHhXW0or4ulwtItAAAgOuc5OyFZ+oQAADALVS0AACA+5zkLGnRDA8AAOASKloAAMB1Dss7AAAAIJqoaAEAgPZp0XKif8x4R48WAACAS6hoAQAA1znJedEhiRYAAHCfk6QrwzN1CAAA4BKmDgEAQDtwknLykEQLAIAIZRVOsmrMAtvrYx1C0iLRAgAArnPo0QIAAEA0UdECAACuc5KyQ4urDgEAAFxDRQsAALjOSdIeLRItAADgOueH/6J9zHjHgqUAAAAuoaIFAADc5yRnNzwVLQAAAJdQ0QIAAK5zkrOgRUULAADALVS0AACA65wkXd6BHi0AAACXUNECAACuc5J0HS0SLQAA4D4nObvhmToEAABwCYlWCzkN22RNxXzpWecXmxA3453I54nNsRM3453I50lbClpOlLd4R6LV4kR/Zv0L0ru+RmxC3Ix3Ip8nNsdO3Ix3Ip8nCA+JVoj5n74iT2T1EdsQN+OdyOeJzbETN+OdyOdJW5d3cKK8xTsSrRATcwfLrC6HiW2Im/FO5PPE5tiJm/FO5PME4eGqwxAbUzPERsTNeCfyeWJz7MTNeCfyeRI5x4XlGOK/pEVFCwAAwCVUtAAAgOscbsEDAACAaGLqEAAAwCVMHQIAANc5STp16AQCgYBYxu/3S2ZmpqQeOkEcb0qswwEAIK4FttdL3Qezpbq6WjIyMmLyM3vjpm+i/t567F7dsmPyucJFRQsAALTT4g5O1I8Z7+jRAgAAcAkVLQAA4DonSXu0qGgBAAC4hIoWAABwnePCDXMsKGhR0QIAAHALFS0AAOA+JzlLWiRaAADAdQ7LOwAAACCaqGgBAADXOSzvAAAAgGiiogUAAFznJGcvPMs7AAAAuIWKFgAAcJ+TnCUtbsEDAACSQnl5uRQWFkpWVpZMnjxZAoHAHl/z9NNPS69evWS//faTxx57LOL3JNECAADtto6WE+X/wlVXVycjR46U/v37y4oVK6SiokLmzp27x8TsvPPOkxtuuEFefvll+eMf/yhr166N6HOTaAEAgHZb3sGJ8hauRYsWSXV1tcycOVPy8/OluLhYSkpKdvuaOXPmyODBg+Xiiy+WQw89VCZNmiQPPfRQ4vdoBUt9ge31sQ4FAIC4F/x5Gc5UmVv8fr9rx2x57NTUVLOFKisrk6KiIklPTzf7BQUFpqq1O/qaU089tWn/6KOPlltuuSXxE62amhrzZ33FvFiHAgCAVT8/MzMz2/U9U1JSpFu3btInL9eV43fq1Elyc5sf+8Ybb5Sbbrqp2WOajOXl5TXtO44jXq9XqqqqTM9Wa1q+JiMjQ7788svET7S0Ia2yslI6d+5sBiqadFD1L0yPrwMKdzHe7Y8xZ7wTHef4zrSSpUmW/vxsb2lpabJhwwapr6937bO1zAVaVrOUz+fb6XGNrba2dpeJVsvXBJ+f8ImWx+ORHj16uPoemmSRaLUfxrv9MeaMd6LjHG+uvStZodLS0swWS9nZ2aa5PZQmn1px291rNm/eHPbzW0MzPAAASHiFhYWyZMmSpn2tsumViJpMhfuaVatWSffu3SN6XxItAACQ8AYNGmSmlUtLS82+XnU4dOhQ06e1ZcsW2b59+06vOeuss+Txxx+XDz74QLZu3Sr33HOPnHzyyRG9L4lWCzoXq010rc3vIvoY7/bHmDPeiY5zHLvqt9LlGnSJhn322Ueee+45mTFjhvme9mhpMtXSYYcdJldccYUcddRRppKlSdnEiRMlEk4gltd6AgAAtKNNmzbJypUrzVIPOTk5Yb1Gl4H44osv5IQTToi4R4tECwAAwCVMHQIAALiERAsAAMAlJFqIGW1E3H///U2D4uGHHy5r1qzhb6OdnHLKKXu8mSqiZ8qUKeZmtnCXNjrrgtN6i5UTTzxR1q9fz5Aj5ki0QuhCZrpmhl59MHny5JjeEyrRrVu3TsaPHy/Tp083DYZ9+/Y1N+2E+x555BFzF3q0j9WrV8usWbPk7rvvZshd/jdF70Gnv8B99NFH5qbBF1xwAWOOmCPR+oEuWqa/cfbv319WrFhhrjDgN373aPVKk6yzzz5bunbtKpdffrlZCA7u+uabb+S3v/2tHHjggQx1O2hsbJRLLrlErrrqKlO9hXv03w+9iuzII4+Unj17yoUXXiiffPIJQ46YI9H6waJFi6S6ulpmzpxpfhPShcxKSkpi+7eTwE477TTzAyho7dq10qdPn5jGlAw0yRo9erT5gQT3PfDAA2Ztnt69e8uCBQtcu9cbRPr16ydvvPGGvP/+++bfcq0iDhs2jKFBzJFo/aCsrMz88NG5fVVQUGCqWnCf/vC588475bLLLmO4XfTmm2/K66+/Lrfddhvj3A50FWld/FgrWRs3bpS77rpLjj/+eNm2bRvj71KiNWbMGDniiCNk7733NrdNueOOOxhrxByJ1g90Wf68vLymgdE7gesKsFVVVbH6u0ka+sNor732okfLRd99951ceumlcv/990vnzp3dfCv84JlnnpFvv/3WJLg333yzvPrqq+aGtA899BBj5IJ3331XFi5cKEuXLjW3Uxk7dqwMHz6cXlvEHInWD/TKt5a33dE7jdfW1sbi7yVpaKn/vvvuk0cffVQ6dOgQ63AS1tSpU82FHiNGjIh1KEnj888/N1VyvdVH8N8YrZTTN+SOxx57TM4991wZMGCAZGZmyq233moa5HW2AoglX0zfPY7o3bv1qsNQ+ttnpEvtI3x653T9rVMTLS37wz2ayG7evNlMqSj9BeLJJ580VQDtZUH09ejRY6dpQp1CPPbYYxluly48+M9//tPs3289z1u7UTDQnki0fqC/7c+ePbtZEqBXImoChujTH0DaEH/66aeb5mztZ1E6hajTtoiut956SxoaGpr2f/e735lqC5e/u0erh7/+9a9NQ7ye6zqVqNWVp556ysV3TV4DBw6UcePGmasO9UpmXVOrW7dupooIxBKJ1g8GDRpk+rRKS0vN+k561eHQoUNNnxai75VXXjEXG+jWMsHVK7QQ/epKqE6dOpkpreC0FqJPb1b74osvmqT26quvln333ddUEXVBTUTfWWedZZaN+fOf/yxfffWVHHLIIfLss8/SkoCY46bSIfTya53K6tixo3g8Hlm8eDFTWgAAoM1ItFrYtGmTrFy50kyr6G+kAAAAbUWiBQAA4BKWdwAAAHAJiRYAAIBLSLQAAABcQqIFAADgEhItAFGh9/V74YUXzArdQe+//758+OGHjDCApEWiBSAqXn75ZTnjjDPMbWZC7z93ySWX7PI1urikruJ9/PHHmzXr9t9/f/N1nz59p
FevXubrAw44wNzDDgBsRKIFJJm5c+fK4Ycf3rR/5513mmRH15D7MZ544glzSyVd2T8QCJjHrr32WnN/xdBjh1a89F6ip556qvzzn/80d2P4zW9+Y76eMmWKTJgwwXytK6tzz1EAtuIWPEAS06m966+/Xv7+97+b+8K11ZdffmludzJ//nxz2xm9f2V6enrT9w866KCmJOvnP/+5lJSUmH29BdAnn3xiKldBeozga1966SXzp94OCwBsRKIFJKna2lpzyym98fHJJ5/8o441Y8YM+f7770316pRTTpH6+vpm39cKl95IPDT5Utu3b5errrqq2T1Fzz//fFPROvTQQ5teyz1HAdiKqUMgSemNjjt37ix/+tOfmh7TCpImOJowXXzxxVJXV2cev+iii+RXv/pV0/Nee+012W+//UyFau3atTJnzpxmN67+29/+Zvqvgv71r39JZmamLFmypFkM5eXl8sEHH5g/ddPv6zEdx2l6TL//0UcfuTwaAOAOEi0gCX388cfy4IMPmq87dOhg/tQpPO2xuuKKK2T58uXy7rvvyu23326+d/bZZ5vpxWDvlU4T6hSg3nz9jjvuMP1U+fn5TcfXG7MHj6/KyspM8nb00Uc3i0OTvCFDhphj66ZXLWqVK7ivm047/v73v2+XcQGAaCPRApKQTuNpErV69WpZuHBhUzO7NslrJUuv+ps4caIsWLDAfE+TIa1uacVJk63nnntOzjnnHPO9W265RW699dZmxx85cqS5+lArUmrFihVy0kkntToFqFOY++67r6mAPfDAA/L666+br3U78sgjxe/3t8OIAIA76NECkpAupaBX83366ady4403ymmnnSaff/65rFq1ylSeVENDg3Tq1Ml87fP55Mwzz5RnnnnGfK2VrGOOOcZ8T5OklvR1mpxpY/whhxxiqmO//OUvW41Fk6+vv/7aXGHYkiZrxx57bJQ/PQC0HxItIAkFpwt1+QWd8tOpQO2x0kqULvegdApPq01BWsHSKUJdakGnErWPanc00Xr++eflyiuvlGXLlklpaekun6vTlg8//PBOj2/evNk07AOArUi0gCTWvXt3M1V40003mWTr7rvvNo3rWoXSKwm1yqTTfurEE080q7/PmjVLXn311V0eU5vZteI1ZswY6d+/v7zzzjvyk5/8xKyvpXTqsWWSduCBBzZNRYba3fsAgA1ItIAkp1UtvWpw5cqVZg0rvRpx/fr1MmDAALOye+gU31lnnWWuTCwsLNzpOLq8gyZZxx13nGmmD9LESh/XKcdgpSzYVB/aPN/aOl7BqUsAsJUTaPkvHgC00jyv03ja9N6lS5dmS0IE6bIQ+v3hw4ebpEyrWq3RhEynLrUypguX6lSkPtZaUqXvG6x+aeKnFTcAsAmJFoA9+uKLL0wvl07xvfLKK+aWPQCAPSPRAgAAcAnraAEAALiERAsAAMAlJFoAAAAuIdECAABwCYkWAACAS0i0AAAAXEKiBQAA4BISLQAAAJeQaAEAAIg7/j9JE9E2myl+yAAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 800x600 with 2 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def create_mask(size):\n",
    "    \"\"\"创建下三角掩码矩阵(下三角为1，上三角为0)\"\"\"\n",
    "    mask = torch.tril(torch.ones(size, size))\n",
    "    return mask\n",
    "\n",
    "# 可视化掩码矩阵\n",
    "seq_len = 10\n",
    "mask = create_mask(seq_len).numpy()  # 直接获取二维数组\n",
    "\n",
    "plt.figure(figsize=(8, 6))\n",
    "plt.imshow(mask, cmap='Blues')  # 不需要索引[0]\n",
    "plt.title(\"GPT中的掩码矩阵\")\n",
    "plt.xlabel(\"Key位置\")\n",
    "plt.ylabel(\"Query位置\")\n",
    "plt.colorbar()\n",
    "\n",
    "for i in range(seq_len):\n",
    "    for j in range(seq_len):\n",
    "        text = \"1\" if mask[i, j] else \"0\"\n",
    "        plt.text(j, i, text, ha=\"center\", va=\"center\", color=\"red\")\n",
    "        \n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "064bde0a",
   "metadata": {},
   "outputs": [],
   "source": [
    "class MultiHeadAttention(nn.Module):\n",
    "    \"\"\"多头自注意力层\"\"\"\n",
    "    def __init__(self, d_model, num_heads):\n",
    "        super(MultiHeadAttention, self).__init__()\n",
    "        self.num_heads = num_heads\n",
    "        self.d_model = d_model\n",
    "        assert d_model % num_heads == 0, \"d_model必须能被num_heads整除\"\n",
    "        \n",
    "        self.depth = d_model // num_heads\n",
    "        \n",
    "        # 线性变换层\n",
    "        self.wq = nn.Linear(d_model, d_model)\n",
    "        self.wk = nn.Linear(d_model, d_model)\n",
    "        self.wv = nn.Linear(d_model, d_model)\n",
    "        \n",
    "        self.dense = nn.Linear(d_model, d_model)\n",
    "        \n",
    "    def split_heads(self, x, batch_size):\n",
    "        \"\"\"将张量分割成多个注意力头\"\"\"\n",
    "        x = x.view(batch_size, -1, self.num_heads, self.depth)\n",
    "        return x.permute(0, 2, 1, 3)  # (batch_size, num_heads, seq_len, depth)\n",
    "    \n",
    "    def forward(self, q, k, v, mask=None):\n",
    "        batch_size = q.shape[0]\n",
    "        \n",
    "        # 线性层\n",
    "        q = self.wq(q)  # (batch_size, seq_len, d_model)\n",
    "        k = self.wk(k)  # (batch_size, seq_len, d_model)\n",
    "        v = self.wv(v)  # (batch_size, seq_len, d_model)\n",
    "        \n",
    "        # 分割注意力头\n",
    "        q = self.split_heads(q, batch_size)  # (batch_size, num_heads, seq_len_q, depth)\n",
    "        k = self.split_heads(k, batch_size)  # (batch_size, num_heads, seq_len_k, depth)\n",
    "        v = self.split_heads(v, batch_size)  # (batch_size, num_heads, seq_len_v, depth)\n",
    "        \n",
    "        # 计算注意力分数\n",
    "        matmul_qk = torch.matmul(q, k.transpose(-1, -2))  # (batch_size, num_heads, seq_len_q, seq_len_k)\n",
    "        \n",
    "        # 缩放\n",
    "        dk = torch.tensor(self.depth, dtype=torch.float32)\n",
    "        scaled_attention_logits = matmul_qk / torch.sqrt(dk)\n",
    "        \n",
    "        # 掩码处理（这是GPT模型的关键特点）\n",
    "        if mask is not None:\n",
    "            scaled_attention_logits += (mask * -1e9)\n",
    "            \n",
    "        # softmax得到注意力权重\n",
    "        attention_weights = torch.softmax(scaled_attention_logits, dim=-1)\n",
    "        \n",
    "        # 注意力加权\n",
    "        output = torch.matmul(attention_weights, v)  # (batch_size, num_heads, seq_len_q, depth)\n",
    "        \n",
    "        # 拼接多头的结果\n",
    "        output = output.permute(0, 2, 1, 3).contiguous()  # (batch_size, seq_len_q, num_heads, depth)\n",
    "        output = output.view(batch_size, -1, self.d_model)  # (batch_size, seq_len_q, d_model)\n",
    "        \n",
    "        output = self.dense(output)\n",
    "            \n",
    "        return output, attention_weights\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "3b463e24",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([512, 512])\n"
     ]
    }
   ],
   "source": [
    "model = MultiHeadAttention(d_model=512, num_heads=8)\n",
    "print(model.wq.weight.shape)  # torch.Size([512, 512])"
   ]
  },
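  {
   "cell_type": "markdown",
   "id": "7c9e0b15",
   "metadata": {},
   "source": [
    "As a quick sanity check on the mask handling above (a minimal sketch reusing `create_mask` and `MultiHeadAttention` from this notebook), the attention weights on future positions should come out as zero:\n",
    "\n",
    "```python\n",
    "mha = MultiHeadAttention(d_model=512, num_heads=8)\n",
    "x = torch.randn(2, 10, 512)  # (batch, seq_len, d_model)\n",
    "_, weights = mha(x, x, x, create_mask(10))\n",
    "\n",
    "# weights: (batch, heads, seq_len_q, seq_len_k); the strict upper triangle\n",
    "# is attention paid to the future and should be all zeros\n",
    "print(weights[0, 0].triu(1).abs().sum())  # tensor(0., ...)\n",
    "```"
   ]
  },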
  {
   "cell_type": "markdown",
   "id": "4df5c990",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "### 5.2 前馈网络\n",
    "\n",
    "每个Transformer块都包含一个前馈网络，由两个线性变换组成，中间有一个非线性激活函数。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "0f402240",
   "metadata": {},
   "outputs": [],
   "source": [
    "class PositionwiseFeedForward(nn.Module):\n",
    "    \"\"\"位置前馈网络\"\"\"\n",
    "    def __init__(self, d_model, dff):\n",
    "        super(PositionwiseFeedForward, self).__init__()\n",
    "        self.linear1 = nn.Linear(d_model, dff)\n",
    "        self.linear2 = nn.Linear(dff, d_model)\n",
    "        self.gelu = nn.GELU()  # GPT使用GELU激活函数，而不是ReLU\n",
    "        \n",
    "    def forward(self, x):\n",
    "        x = self.linear1(x)\n",
    "        x = self.gelu(x)\n",
    "        x = self.linear2(x)\n",
    "        return x\n"
   ]
  },
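  {
   "cell_type": "markdown",
   "id": "2d6f8a41",
   "metadata": {},
   "source": [
    "A one-line shape check (a sketch with arbitrary dimensions): the expansion to `dff` happens independently at every position, and the output matches the input shape so the residual connection can be applied.\n",
    "\n",
    "```python\n",
    "ffn = PositionwiseFeedForward(d_model=512, dff=2048)\n",
    "x = torch.randn(2, 10, 512)\n",
    "print(ffn(x).shape)  # torch.Size([2, 10, 512]): same shape, ready for the residual add\n",
    "```"
   ]
  },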
  {
   "cell_type": "markdown",
   "id": "1c823c4e",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "### 5.3 Transformer解码器层\n",
    "\n",
    "结合前面实现的组件，我们可以构建一个完整的Transformer解码器层。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "fd39cb29",
   "metadata": {},
   "outputs": [],
   "source": [
    "class DecoderLayer(nn.Module):\n",
    "    \"\"\"GPT使用的Transformer解码器层\"\"\"\n",
    "    def __init__(self, d_model, num_heads, dff, dropout_rate=0.1):\n",
    "        super(DecoderLayer, self).__init__()\n",
    "\n",
    "        self.mha = MultiHeadAttention(d_model, num_heads)\n",
    "        self.ffn = PositionwiseFeedForward(d_model, dff)\n",
    "\n",
    "        self.layernorm1 = nn.LayerNorm(d_model, eps=1e-6)\n",
    "        self.layernorm2 = nn.LayerNorm(d_model, eps=1e-6)\n",
    "        \n",
    "        self.dropout1 = nn.Dropout(dropout_rate)\n",
    "        self.dropout2 = nn.Dropout(dropout_rate)\n",
    "    \n",
    "    def forward(self, x, mask):\n",
    "        # 多头自注意力（带掩码）\n",
    "        attn_output, _ = self.mha(x, x, x, mask)\n",
    "        attn_output = self.dropout1(attn_output)\n",
    "        out1 = self.layernorm1(x + attn_output)  # 残差连接和层归一化\n",
    "        \n",
    "        # 前馈网络\n",
    "        ffn_output = self.ffn(out1)\n",
    "        ffn_output = self.dropout2(ffn_output)\n",
    "        out2 = self.layernorm2(out1 + ffn_output)  # 残差连接和层归一化\n",
    "        \n",
    "        return out2\n"
   ]
  },
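  {
   "cell_type": "markdown",
   "id": "4a0b9e63",
   "metadata": {},
   "source": [
    "A short smoke test for a single decoder layer (a sketch with arbitrary dimensions), confirming that the output keeps the input shape so layers can be stacked:\n",
    "\n",
    "```python\n",
    "layer = DecoderLayer(d_model=512, num_heads=8, dff=2048)\n",
    "x = torch.randn(2, 10, 512)\n",
    "out = layer(x, create_mask(10))\n",
    "print(out.shape)  # torch.Size([2, 10, 512])\n",
    "```"
   ]
  },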
  {
   "cell_type": "markdown",
   "id": "c3efe53e",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "### 5.4 完整的GPT模型\n",
    "\n",
    "下面我们将所有组件组合起来，实现一个简化版的GPT模型：\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "f9ba24d5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "模型参数总量：58.29M\n"
     ]
    }
   ],
   "source": [
    "class GPT(nn.Module):\n",
    "    \"\"\"简化版GPT模型\"\"\"\n",
    "    def __init__(self, vocab_size, d_model, num_heads, dff, num_layers, max_position, dropout_rate=0.1):\n",
    "        super(GPT, self).__init__()\n",
    "        \n",
    "        self.d_model = d_model\n",
    "        self.num_layers = num_layers\n",
    "        \n",
    "        # 词嵌入层\n",
    "        self.embedding = nn.Embedding(vocab_size, d_model)\n",
    "        # 位置嵌入层\n",
    "        self.pos_embedding = nn.Embedding(max_position, d_model)\n",
    "        \n",
    "        self.dropout = nn.Dropout(dropout_rate)\n",
    "        \n",
    "        # 堆叠多个解码器层\n",
    "        self.decoder_layers = nn.ModuleList([\n",
    "            DecoderLayer(d_model, num_heads, dff, dropout_rate) \n",
    "            for _ in range(num_layers)\n",
    "        ])\n",
    "        \n",
    "        # 输出层\n",
    "        self.final_layer = nn.Linear(d_model, vocab_size)\n",
    "        \n",
    "    def forward(self, x):\n",
    "        seq_len = x.shape[1]\n",
    "        batch_size = x.shape[0]\n",
    "        \n",
    "        # 创建掩码\n",
    "        mask = create_mask(seq_len)\n",
    "        \n",
    "        # 创建位置索引\n",
    "        positions = torch.arange(0, seq_len).unsqueeze(0).repeat(batch_size, 1).to(x.device)\n",
    "        \n",
    "        # 词嵌入+位置嵌入\n",
    "        x = self.embedding(x)\n",
    "        pos_emb = self.pos_embedding(positions)\n",
    "        \n",
    "        x = x + pos_emb\n",
    "        x = self.dropout(x)\n",
    "        \n",
    "        # 通过所有解码器层\n",
    "        for i in range(self.num_layers):\n",
    "            x = self.decoder_layers[i](x, mask)\n",
    "            \n",
    "        # 输出层\n",
    "        logits = self.final_layer(x)\n",
    "        \n",
    "        return logits\n",
    "\n",
    "# 创建一个小型GPT模型实例\n",
    "small_gpt = GPT(\n",
    "    vocab_size=10000,  # 词汇表大小\n",
    "    d_model=768,       # 嵌入维度\n",
    "    num_heads=12,      # 注意力头数\n",
    "    dff=3072,          # 前馈网络维度\n",
    "    num_layers=6,      # 解码器层数\n",
    "    max_position=512,  # 最大位置编码\n",
    "    dropout_rate=0.1   # 丢弃率\n",
    ")\n",
    "\n",
    "# 打印模型概览\n",
    "print(f\"模型参数总量：{sum(p.numel() for p in small_gpt.parameters())/1000000:.2f}M\")\n"
   ]
  },
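  {
   "cell_type": "markdown",
   "id": "8e1d7f52",
   "metadata": {},
   "source": [
    "With the model instantiated, a forward pass on dummy token ids shows what the output represents (a sketch; the model is untrained, so the logits themselves are meaningless):\n",
    "\n",
    "```python\n",
    "tokens = torch.randint(0, 10000, (2, 16))  # (batch=2, seq_len=16) random token ids\n",
    "logits = small_gpt(tokens)\n",
    "print(logits.shape)  # torch.Size([2, 16, 10000]): next-token logits at every position\n",
    "```"
   ]
  },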
  {
   "cell_type": "markdown",
   "id": "cf6aaae7",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 6. GPT的训练过程\n",
    "\n",
    "GPT的训练包含两个阶段：无监督预训练和有监督微调。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "635930c2",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "### 6.1 无监督预训练\n",
    "\n",
    "在无监督预训练阶段，GPT学习预测序列中的下一个词。下面我们实现一个简化的语言建模训练函数：\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "ed727c9f",
   "metadata": {},
   "outputs": [],
   "source": [
    "def train_language_model(model, data_loader, optimizer, epochs):\n",
    "    \"\"\"语言模型预训练函数（伪代码）\"\"\"\n",
    "    model.train()\n",
    "    criterion = nn.CrossEntropyLoss()\n",
    "    \n",
    "    for epoch in range(epochs):\n",
    "        total_loss = 0\n",
    "        \n",
    "        for batch in data_loader:\n",
    "            inputs = batch[:, :-1]  # 所有词，除了最后一个\n",
    "            targets = batch[:, 1:]  # 所有词，除了第一个\n",
    "            \n",
    "            # 前向传播\n",
    "            logits = model(inputs)\n",
    "            \n",
    "            # 计算损失：对每个位置，预测下一个词\n",
    "            loss = criterion(logits.reshape(-1, logits.shape[-1]), targets.reshape(-1))\n",
    "            \n",
    "            # 反向传播和优化\n",
    "            optimizer.zero_grad()\n",
    "            loss.backward()\n",
    "            optimizer.step()\n",
    "            \n",
    "            total_loss += loss.item()\n",
    "        \n",
    "        print(f\"Epoch {epoch+1}, Loss: {total_loss/len(data_loader)}\")\n",
    "        \n",
    "# 伪代码示例，不实际运行\n",
    "# optimizer = torch.optim.Adam(small_gpt.parameters(), lr=5e-5)\n",
    "# train_language_model(small_gpt, data_loader, optimizer, epochs=10)\n"
   ]
  },
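  {
   "cell_type": "markdown",
   "id": "1b5c3d98",
   "metadata": {},
   "source": [
    "The `inputs`/`targets` slicing above is the standard shift-by-one pairing for language modeling. On a concrete toy batch (the token ids are arbitrary):\n",
    "\n",
    "```python\n",
    "batch = torch.tensor([[5, 12, 7, 9, 2]])\n",
    "inputs, targets = batch[:, :-1], batch[:, 1:]\n",
    "print(inputs)   # tensor([[ 5, 12,  7,  9]])  the model reads these...\n",
    "print(targets)  # tensor([[12,  7,  9,  2]])  ...and must predict these\n",
    "```"
   ]
  },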
  {
   "cell_type": "markdown",
   "id": "666bc903",
   "metadata": {},
   "source": [
    "### 6.2 GPT的损失函数详解\n",
    "\n",
    "GPT模型训练的核心是其损失函数设计，它直接影响模型的学习效果。下面我们详细讲解GPT中使用的损失函数：\n",
    "\n",
    "#### 6.2.1 预训练阶段的损失函数\n",
    "\n",
    "在预训练阶段，GPT采用**语言模型损失**（Language Modeling Loss），即**交叉熵损失**（Cross Entropy Loss）。其数学表达式为：\n",
    "\n",
    "$$\\mathcal{L}_{\\text{LM}}(\\theta) = -\\sum_{i=1}^{n} \\log P_{\\theta}(w_i|w_{<i})$$\n",
    "\n",
    "其中：\n",
    "- $w_i$ 是序列中的第i个词\n",
    "- $w_{<i}$ 表示$w_i$之前的所有词\n",
    "- $P_{\\theta}(w_i|w_{<i})$ 是模型预测第i个词的概率\n",
    "- $\\theta$ 代表模型参数\n",
    "\n",
    "具体来说，这个损失函数是在计算模型预测下一个词的负对数似然（negative log-likelihood）。我们希望最大化模型预测正确下一个词的概率，或者等价地，最小化预测错误的概率。\n",
    "\n",
    "#### 6.2.2 损失函数的PyTorch实现\n",
    "\n",
    "在PyTorch中，交叉熵损失通过`nn.CrossEntropyLoss`来实现：\n",
    "\n",
    "```python\n",
    "criterion = nn.CrossEntropyLoss()\n",
    "\n",
    "# 假设:\n",
    "# logits的形状为[batch_size, sequence_length, vocab_size]\n",
    "# targets的形状为[batch_size, sequence_length]\n",
    "\n",
    "# 计算损失时，我们需要调整维度:\n",
    "# - 预测值: [batch_size*sequence_length, vocab_size]\n",
    "# - 目标值: [batch_size*sequence_length]\n",
    "loss = criterion(logits.view(-1, logits.size(-1)), targets.view(-1))\n",
    "```\n",
    "\n",
    "#### 6.2.3 微调阶段的损失函数\n",
    "\n",
    "在有监督微调阶段，GPT根据不同任务采用不同的损失函数：\n",
    "\n",
    "1. **对于分类任务**：交叉熵损失\n",
    "   \n",
    "   $$\\mathcal{L}_{\\text{cls}}(\\theta) = -\\sum_{c=1}^{C} y_c \\log P_{\\theta}(c|x)$$\n",
    "   \n",
    "   其中$C$是类别数，$y_c$是标签的one-hot表示，$P_{\\theta}(c|x)$是模型预测样本$x$属于类别$c$的概率。\n",
    "\n",
    "2. **对于回归任务**：均方误差损失（MSE）\n",
    "   \n",
    "   $$\\mathcal{L}_{\\text{reg}}(\\theta) = \\frac{1}{n}\\sum_{i=1}^{n}(y_i - \\hat{y}_i)^2$$\n",
    "\n",
    "3. **对于多任务学习**：加权损失组合\n",
    "   \n",
    "   GPT-1论文中提出的损失函数组合方式：\n",
    "   \n",
    "   $$\\mathcal{L}_{\\text{total}}(\\theta) = \\mathcal{L}_{\\text{task}}(\\theta) + \\lambda \\cdot \\mathcal{L}_{\\text{LM}}(\\theta)$$\n",
    "   \n",
    "   其中$\\lambda$是权重系数，控制语言模型损失的贡献程度。\n",
    "\n",
    "#### 6.2.4 实际训练中的考量\n",
    "\n",
    "1. **梯度裁剪**：防止梯度爆炸\n",
    "2. **学习率调度**：通常采用线性预热（linear warmup）和余弦衰减（cosine decay）相结合的策略\n",
    "3. **批量累积**（Gradient Accumulation）：在硬件受限时，通过多次前向传播和反向传播累积梯度\n",
    "4. **混合精度训练**：使用FP16和FP32混合精度以加速训练\n"
   ]
  },
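  {
   "cell_type": "markdown",
   "id": "6d2e4f07",
   "metadata": {},
   "source": [
    "In the sketch, `model`, `data_loader`, `optimizer`, and `criterion` are assumed from the pre-training loop above; `scheduler` stands for a warmup/decay schedule (for example one built with `torch.optim.lr_scheduler.LambdaLR`), and `accum_steps` is an illustrative choice.\n",
    "\n",
    "```python\n",
    "scaler = torch.cuda.amp.GradScaler()             # (4) mixed-precision loss scaling\n",
    "accum_steps = 4                                  # (3) gradient accumulation factor\n",
    "\n",
    "for step, batch in enumerate(data_loader):\n",
    "    inputs, targets = batch[:, :-1], batch[:, 1:]\n",
    "    with torch.cuda.amp.autocast():              # (4) FP16 forward pass\n",
    "        logits = model(inputs)\n",
    "        loss = criterion(logits.reshape(-1, logits.shape[-1]), targets.reshape(-1))\n",
    "    scaler.scale(loss / accum_steps).backward()  # accumulate scaled gradients\n",
    "    if (step + 1) % accum_steps == 0:\n",
    "        scaler.unscale_(optimizer)               # so clipping sees true gradient norms\n",
    "        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # (1) gradient clipping\n",
    "        scaler.step(optimizer)\n",
    "        scaler.update()\n",
    "        optimizer.zero_grad()\n",
    "        scheduler.step()                         # (2) advance the LR schedule\n",
    "```"
   ]
  },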
  {
   "cell_type": "markdown",
   "id": "81666537",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "### 6.3 有监督微调\n",
    "\n",
    "预训练完成后，GPT可以针对特定任务进行微调。以文本分类为例：\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dca919b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "class GPTForClassification(nn.Module):\n",
    "    \"\"\"用于分类任务的GPT模型\"\"\"\n",
    "    def __init__(self, pretrained_gpt, num_classes):\n",
    "        super(GPTForClassification, self).__init__()\n",
    "        self.gpt = pretrained_gpt\n",
    "        # 冻结预训练模型参数（可选）\n",
    "        # for param in self.gpt.parameters():\n",
    "        #    param.requires_grad = False\n",
    "        \n",
    "        # 添加分类头\n",
    "        self.classifier = nn.Linear(self.gpt.d_model, num_classes)\n",
    "        \n",
    "    def forward(self, x):\n",
    "        # 获取GPT的输出\n",
    "        hidden_states = self.gpt(x)\n",
    "        # 使用最后一个时间步的隐藏状态进行分类\n",
    "        pooled_output = hidden_states[:, -1]\n",
    "        # 分类层\n",
    "        logits = self.classifier(pooled_output)\n",
    "        return logits\n",
    "\n",
    "def finetune_classifier(model, data_loader, optimizer, epochs):\n",
    "    \"\"\"分类任务微调函数（伪代码）\"\"\"\n",
    "    model.train()\n",
    "    criterion = nn.CrossEntropyLoss()\n",
    "    \n",
    "    for epoch in range(epochs):\n",
    "        total_loss = 0\n",
    "        correct = 0\n",
    "        total = 0\n",
    "        \n",
    "        for inputs, labels in data_loader:\n",
    "            # 前向传播\n",
    "            logits = model(inputs)\n",
    "            \n",
    "            # 计算损失\n",
    "            loss = criterion(logits, labels)\n",
    "            \n",
    "            # 反向传播和优化\n",
    "            optimizer.zero_grad()\n",
    "            loss.backward()\n",
    "            optimizer.step()\n",
    "            \n",
    "            total_loss += loss.item()\n",
    "            \n",
    "            # 计算准确率\n",
    "            _, predicted = torch.max(logits, 1)\n",
    "            total += labels.size(0)\n",
    "            correct += (predicted == labels).sum().item()\n",
    "        \n",
    "        accuracy = 100 * correct / total\n",
    "        print(f\"Epoch {epoch+1}, Loss: {total_loss/len(data_loader)}, Accuracy: {accuracy}%\")\n",
    "\n",
    "# 伪代码示例，不实际运行\n",
    "# classifier_model = GPTForClassification(small_gpt, num_classes=2)\n",
    "# optimizer = torch.optim.Adam(classifier_model.parameters(), lr=2e-5)\n",
    "# finetune_classifier(classifier_model, data_loader, optimizer, epochs=3)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "80338d42",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 7. 使用Hugging Face的预训练GPT模型\n",
    "\n",
    "在实际应用中，我们可以直接使用Hugging Face提供的预训练GPT模型，避免从头训练的巨大成本。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "0f0ff8a6",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "输入: Apple is tasty. I like to  \n",
      "生成: Apple is tasty. I like to   have a good time with my friends.\n",
      "I'm not sure if I'm going to be able to get my hands on a new iPhone 6 or 6 Plus, but I'll be sure to check\n"
     ]
    }
   ],
   "source": [
    "try:\n",
    "    from transformers import GPT2LMHeadModel, GPT2Tokenizer\n",
    "    \n",
    "    # 加载预训练的GPT-2模型和分词器\n",
    "    model_name = \"gpt2\"  # 英文模型\n",
    "    # 对于中文，可以使用：\"uer/gpt2-chinese-cluecorpussmall\"\n",
    "    \n",
    "    tokenizer = GPT2Tokenizer.from_pretrained(model_name)\n",
    "    model = GPT2LMHeadModel.from_pretrained(model_name)\n",
    "    \n",
    "    # 准备输入文本\n",
    "    text = \"Apple is tasty. I like to  \"\n",
    "    inputs = tokenizer(text, return_tensors=\"pt\")\n",
    "    \n",
    "    # 生成文本\n",
    "    outputs = model.generate(\n",
    "        inputs[\"input_ids\"],\n",
    "        max_length=50,\n",
    "        num_return_sequences=1,\n",
    "        no_repeat_ngram_size=2,\n",
    "        temperature=0.7,\n",
    "    )\n",
    "    \n",
    "    # 解码生成的文本\n",
    "    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
    "    print(f\"输入: {text}\")\n",
    "    print(f\"生成: {generated_text}\")\n",
    "    \n",
    "except ImportError:\n",
    "    print(\"请安装transformers库：pip install transformers\")\n",
    "except Exception as e:\n",
    "    print(f\"运行出错: {e}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d7401ea",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 8. GPT与BERT的对比\n",
    "\n",
    "为了更好地为下一节BERT模型学习做准备，让我们比较一下GPT和BERT的主要区别：\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "786fffe9",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "### GPT vs BERT 关键区别对比\n",
    "\n",
    "| 特性 | GPT | BERT |\n",
    "|------|-----|------|\n",
    "| **架构** | 仅使用Transformer解码器 | 仅使用Transformer编码器 |\n",
    "| **注意力机制** | 单向注意力(掩码自注意力) | 双向注意力(完全自注意力) |\n",
    "| **预训练目标** | 自回归语言建模(预测下一个词) | 掩码语言建模(预测被掩盖的词)+下一句预测 |\n",
    "| **信息流向** | 单向(只能看到左侧词) | 双向(可以同时看到左右两侧词) |\n",
    "| **适合任务** | 生成式任务(文本生成、对话) | 理解式任务(分类、问答、命名实体识别) |\n",
    "| **上下文感知** | 有限(只能考虑前面的词) | 完整(可以考虑整个句子) |\n",
    "\n",
    "\n",
    "### GPT的局限性（BERT如何改进）\n",
    "\n",
    "1. **单向语境限制**：GPT只能看到前面的词，无法获取完整的语境信息\n",
    "2. **预训练任务单一**：仅使用自回归语言建模，没有多任务学习\n",
    "3. **不适合理解型任务**：在需要整体语义理解的任务上表现较弱\n",
    "\n",
    "### BERT的改进\n",
    "\n",
    "1. **双向注意力**：使用掩码语言模型，允许模型看到左右两侧的词\n",
    "2. **多任务预训练**：结合掩码语言建模和下一句预测两个预训练任务\n",
    "3. **特殊标记**：引入[CLS]和[SEP]等特殊标记，更好地处理不同任务\n",
    "\n",
    "正是由于BERT的这些创新，使它在理解型任务上表现更为出色，成为NLP领域的里程碑。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6daa110",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 9. 总结\n",
    "\n",
    "通过本教程，我们了解了GPT模型的核心原理和实现：\n",
    "\n",
    "1. **GPT核心思想**：基于Transformer解码器的生成式预训练语言模型，采用自回归预训练和有监督微调。这种架构叫做 Decoder-Only 架构。\n",
    "\n",
    "2. **架构特点**：\n",
    "   - 单向掩码自注意力机制\n",
    "   - 仅使用Transformer的解码器部分\n",
    "   - 层叠的多头注意力和前馈网络\n",
    "\n",
    "3. **训练范式**：\n",
    "   - 无监督预训练：在大规模文本上进行自回归语言建模\n",
    "   - 有监督微调：针对特定任务进行模型参数优化\n",
    "\n",
    "4. **与BERT的区别**：\n",
    "   - GPT：单向注意力，擅长生成\n",
    "   - BERT：双向注意力，擅长理解\n",
    "\n",
    "5. **应用场景**：\n",
    "   - 文本生成\n",
    "   - 对话系统\n",
    "   - 内容创作\n",
    "\n",
    "在下一节课中，我们将深入学习BERT模型，了解它如何通过双向注意力和掩码语言建模更好地捕捉语言的语义信息，并在各种理解型任务上取得突破性的表现。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e00a919",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## 10. 参考资料\n",
    "\n",
    "1. Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.\n",
    "\n",
    "2. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners.\n",
    "\n",
    "3. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.\n",
    "\n",
    "4. Hugging Face Transformers 文档: https://huggingface.co/transformers/\n",
    "\n",
    "5. \"The Illustrated GPT-2\" by Jay Alammar: https://jalammar.github.io/illustrated-gpt2/\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llm-algorithm (3.13.7)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
